Deep Learning of Local RGB-D Patches for 3D Object Detection and 6D Pose Estimation
Abstract
We present a 3D object detection method that uses regressed descriptors of locally sampled RGB-D patches for 6D vote casting. For regression, we employ a convolutional auto-encoder trained on a large collection of random local patches. During testing, scene patch descriptors are matched against a database of synthetic model view patches and cast 6D object votes, which are subsequently filtered into refined hypotheses. We evaluate on three datasets to show that our method generalizes well to previously unseen input data and delivers robust detection results that compete with and surpass the state of the art, while remaining scalable in the number of objects.
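The matching and voting stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes descriptors are already computed, uses brute-force nearest-neighbour search, and casts votes only for an object ID and a 3D object centre (the rotational part of a 6D vote is omitted). All names (`cast_votes`, `filter_votes`, `db_offset`, `bin_size`, `min_votes`) are hypothetical.

```python
import numpy as np

def cast_votes(scene_desc, scene_pos, db_desc, db_offset, db_obj, k=1, max_dist=1.0):
    """Match each scene descriptor to its k nearest database descriptors
    and cast one (object id, predicted centre) vote per accepted match.
    Assumption: db_offset[j] is the vector from patch j to its object centre."""
    # Pairwise squared Euclidean distances, shape (n_scene, n_db).
    d2 = ((scene_desc[:, None, :] - db_desc[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = []
    for i, neighbours in enumerate(nearest):
        for j in neighbours:
            # Reject matches whose descriptor distance is too large.
            if d2[i, j] <= max_dist ** 2:
                votes.append((int(db_obj[j]), scene_pos[i] + db_offset[j]))
    return votes

def filter_votes(votes, bin_size=0.05, min_votes=3):
    """Group votes by object id and spatial bin; keep well-supported bins.
    Returns a list of (object id, mean centre, vote count) hypotheses."""
    bins = {}
    for obj, centre in votes:
        key = (obj, tuple(np.floor(centre / bin_size).astype(int)))
        bins.setdefault(key, []).append(centre)
    return [(key[0], np.mean(centres, axis=0), len(centres))
            for key, centres in bins.items() if len(centres) >= min_votes]

# Toy data: two database patches belonging to hypothetical objects 7 and 9.
db_desc = np.array([[0.0, 0.0], [10.0, 10.0]])
db_offset = np.array([[0.1, 0.0, 0.0], [0.0, 0.0, 0.0]])
db_obj = np.array([7, 9])

# Three scene patches resemble object 7; one resembles object 9.
scene_desc = np.array([[0.1, 0.0], [0.0, 0.1], [0.0, 0.0], [10.0, 10.1]])
scene_pos = np.array([[0.9, 1.0, 1.0]] * 3 + [[5.0, 5.0, 5.0]])

votes = cast_votes(scene_desc, scene_pos, db_desc, db_offset, db_obj)
hypotheses = filter_votes(votes)
```

In this toy run the three agreeing votes for object 7 survive filtering, while the single vote for object 9 falls below `min_votes` and is discarded. A practical system would replace the brute-force distance matrix with an approximate nearest-neighbour index and vote over rotation as well as translation.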
Similar resources
Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image
Detecting objects and their 6D poses from only RGB images is an important task for many robotic applications. While deep learning methods have made significant progress in visual object detection and segmentation, the object pose estimation task is still challenging. In this paper, we introduce an end-to-end deep learning framework, named Deep-6DPose, that jointly detects, segments, and most imp...
The Best of Both Worlds: Learning Geometry-based 6D Object Pose Estimation
We address the task of estimating the 6D pose of known rigid objects, from RGB and RGB-D input images, in scenarios where the objects are heavily occluded. Our main contribution is a new modular processing pipeline. The first module localizes all known objects in the image via an existing instance segmentation network. The next module densely regresses the object surface positions in its local ...
First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations
In this work we study the use of 3D hand poses to recognize first-person hand actions interacting with 3D objects. Towards this goal, we collected RGB-D video sequences of more than 100K frames of 45 daily hand action categories, involving 25 different objects in several hand grasp configurations. To obtain high quality hand pose annotations from real sequences, we used our own mo-cap system t...
6D Object Detection and Next-Best-View Prediction in the Crowd
6D object detection and pose estimation in the crowd (scenes with multiple object instances, severe foreground occlusions and background distractors) has become an important problem in many rapidly evolving technological areas such as robotics and augmented reality. Single shot-based 6D pose estimators with manually designed features are still unable to tackle the above challenges, motivating t...
DeepIM: Deep Iterative Matching for 6D Pose Estimation
Estimating the 6D pose of objects from images is an important problem in various applications such as robot manipulation and virtual reality. While direct regression of images to object poses has limited accuracy, matching rendered images of an object against the input image can produce accurate results. In this work, we propose a novel deep neural network for 6D pose matching named DeepIM. Giv...
Publication date: 2016